 color value



NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation

arXiv.org Artificial Intelligence

Talking head generation based on neural radiance fields (NeRF) has shown promising visual effects. However, the slow rendering speed of NeRF seriously limits its application, because synthesizing one pixel requires a burdensome calculation over hundreds of sampled points. In this work, a novel Neural Light Dynamic Fields (NLDF) model is proposed, aiming to generate high-quality 3D talking faces with a significant speedup. NLDF represents light fields based on light segments, and a deep network is used to learn the entire light beam's information at once. During training, knowledge distillation is applied, and the NeRF-based synthesized result is used to guide the correct coloration of light segments in NLDF. Furthermore, a novel active pool training strategy is proposed to focus on high-frequency movements, particularly of the speaker's mouth and eyebrows. The proposed method effectively represents the facial light dynamics in 3D talking video generation, and it is approximately 30 times faster than the state-of-the-art NeRF-based method, with comparable visual quality.
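To make the per-pixel cost mentioned in the abstract concrete, here is a minimal sketch of the standard volume rendering quadrature used by NeRF-style renderers. It is not the NLDF or any paper's code; the function name `composite_ray` and its inputs are hypothetical and chosen for illustration only. Each pixel needs one network query per sample along the ray, which is the loop a light-field approach tries to replace with a single evaluation.

```python
import math

def composite_ray(densities, colors, step):
    """Alpha-composite samples along one ray (standard volume rendering quadrature).

    densities: list of non-negative floats, one per sample (assumed network output)
    colors:    list of (r, g, b) tuples, one per sample (assumed network output)
    step:      distance between consecutive samples
    """
    transmittance = 1.0          # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb in zip(densities, colors):
        alpha = 1.0 - math.exp(-sigma * step)   # opacity of this segment
        weight = transmittance * alpha
        for c in range(3):
            pixel[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
    return pixel

# Two samples: a nearly opaque red one in front of a green one.
front_red = composite_ray([1000.0, 1000.0], [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], step=1.0)
```

With hundreds of samples per ray, the loop body runs hundreds of times per pixel, which is the cost the abstract refers to.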


On-Device Unsupervised Image Segmentation

arXiv.org Artificial Intelligence

Along with the breakthrough of convolutional neural networks, learning-based segmentation has emerged in many research works. Most of them are based on supervised learning, requiring plenty of annotated data; however, to support segmentation, a label for each pixel is required, which is obviously expensive. As a result, a lack of annotated segmentation data is common. Continuous learning is a promising way to deal with this issue; however, it still places high demands on human labor for annotation. What's more, real-world applications often require segmentation data to stay private, which further calls for on-device learning. In this paper, we aim to resolve the above issue in an alternative way: instead of supervised segmentation, we propose to develop efficient unsupervised segmentation that can be executed on edge devices. Based on our observation that segmentation can obtain high performance when pixels are mapped to a high-dimensional space, we for the first time bring brain-inspired hyperdimensional computing (HDC) to the segmentation task. We build an HDC-based unsupervised segmentation framework, namely "SegHDC". In SegHDC, we devise a novel encoding approach that follows the Manhattan distance. A clustering algorithm is further developed on top of the encoded high-dimensional vectors to obtain segmentation results. Experimental results show SegHDC can significantly surpass neural network-based unsupervised segmentation. On a standard segmentation dataset, DSB2018, SegHDC can achieve a 28.0% improvement in Intersection over Union (IoU) score; meanwhile, it achieves over 300x speedup on Raspberry Pi. What's more, for a larger image in the BBBC005 dataset, the existing approach cannot run on Raspberry Pi because it runs out of memory, whereas SegHDC obtains segmentation results within 3 minutes while achieving a 0.9587 IoU score.
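The Intersection over Union (IoU) score quoted above can be computed for binary masks as in the following minimal sketch. This is the generic metric definition, not code from SegHDC; the function name `iou` and the nested-list mask format are assumptions made for illustration.

```python
def iou(pred, truth):
    """Intersection over Union of two binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly; treating that case as 1.0 is a convention.
    return intersection / union if union else 1.0

predicted = [[1, 1],
             [0, 0]]
reference = [[1, 0],
             [0, 0]]
score = iou(predicted, reference)  # 1 shared pixel out of 2 covered pixels
```

A score of 1.0 means the predicted mask matches the reference exactly, and 0.0 means the two masks share no foreground pixel.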


Python Projects with Source Code - Practice Top Projects in Python - DataFlair

#artificialintelligence

Looking to build a career in Python? Want to improve your resume with multiple personal projects on it? Then this blog of Python projects with source code is for you. Earlier, you read about the top 5 data science projects; now, we bring you 12 projects implementing data science with Python. In this blog, you'll find the entire code for all the projects.


Photo Mosaics with Nearest Neighbors: Machine Learning for Digital Art

#artificialintelligence

Technological innovation is increasing at a rapid pace and has made digital storage extremely cheap and accessible. Additionally, most people now have phones with cameras that are able to capture high quality images. The majority of images taken are viewed a few times and then sent to sit on a hard drive or some cloud storage service. I am no different, and since I had some extra time during the COVID-19 lockdowns, I came up with some software to give the photos in people's libraries a second life. This software creates photo mosaics.


Top 10 Data Science Project Ideas for Beginners and Experts

#artificialintelligence

In the domain of artificial intelligence, data science has been a buzzword for the last few years. As more industries and sectors realize the need for data science, more opportunities are opening up. For this generation, data science provides the best career option. The demand for data scientists is continuously increasing in the market. To become a professional data scientist, you can work on some technical data science projects; this will help boost your career growth.


Chromatic and spatial analysis of one-pixel attacks against an image classifier

arXiv.org Artificial Intelligence

The one-pixel attack is a curious way of deceiving a neural network classifier by changing only one pixel in the input image. The full potential and boundaries of this attack method are not yet fully understood. In this research, the successful and unsuccessful attacks are studied in more detail to illustrate the working mechanisms of a one-pixel attack created using differential evolution. The data comes from our earlier studies where we applied the attack against medical imaging. We used a real breast cancer tissue dataset and a real classifier as the attack target. This research presents ways to analyze chromatic and spatial distributions of one-pixel attacks. In addition, we present one-pixel attack confidence maps to illustrate the behavior of the target classifier. We show that the more effective attacks change the color of the pixel more, and that the successful attacks are situated at the center of the images. This kind of analysis is useful for understanding not only the behavior of the attack but also the qualities of the classifying neural network.
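The idea of a one-pixel attack can be illustrated with a toy sketch. Everything here is an assumption for illustration: `toy_classifier` is a hypothetical stand-in that uses mean intensity as its confidence, and the search is exhaustive over a tiny grayscale image rather than the differential evolution used in the research described above.

```python
def toy_classifier(image):
    """Hypothetical stand-in classifier: confidence for class 1 is the mean intensity."""
    flat = [value for row in image for value in row]
    return sum(flat) / len(flat)

def one_pixel_search(image, classify, candidates=(0.0, 1.0)):
    """Try every single-pixel change and keep the one that lowers confidence most.

    Returns (base_confidence, best_confidence, change), where change is
    (row, column, new_value) or None if no change lowers the confidence.
    """
    base = classify(image)
    best_conf, best_change = base, None
    for y, row in enumerate(image):
        for x, original in enumerate(row):
            for value in candidates:
                if value == original:
                    continue
                perturbed = [list(r) for r in image]   # copy, then alter one pixel
                perturbed[y][x] = value
                conf = classify(perturbed)
                if conf < best_conf:
                    best_conf, best_change = conf, (y, x, value)
    return base, best_conf, best_change

image = [[0.6, 0.6],
         [0.6, 0.4]]
base, attacked, change = one_pixel_search(image, toy_classifier)
# With a 0.5 decision threshold, the single change flips the toy prediction.
```

Recording `best_conf` for every pixel position instead of only the minimum would give a simple analogue of the confidence maps mentioned in the abstract.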


Decomposing 3D Scenes into Objects via Unsupervised Volume Segmentation

arXiv.org Machine Learning

We present ObSuRF, a method which turns a single image of a scene into a 3D model represented as a set of Neural Radiance Fields (NeRFs), with each NeRF corresponding to a different object. A single forward pass of an encoder network outputs a set of latent vectors describing the objects in the scene. These vectors are used independently to condition a NeRF decoder, defining the geometry and appearance of each object. We make learning more computationally efficient by deriving a novel loss, which allows training NeRFs on RGB-D inputs without explicit ray marching. After confirming that the model performs equal to or better than the state of the art on three 2D image segmentation benchmarks, we apply it to two multi-object 3D datasets: a multiview version of CLEVR, and a novel dataset in which scenes are populated by ShapeNet models. We find that after training ObSuRF on RGB-D views of training scenes, it is capable not only of recovering the 3D geometry of a scene depicted in a single input image, but also of segmenting it into objects, despite receiving no supervision in that regard.


New MIT/Google algorithm retouches photos in real time

Daily Mail - Science & tech

The program is efficient enough to run on phones and is so fast that it can display retouched images in real time, making it possible for users to see the final version of the image while still framing the shot. Researchers from Google and MIT's Computer Science and Artificial Intelligence Laboratory unveiled it this week at Siggraph, the premier digital graphics conference. The work builds on an earlier project from the MIT researchers that involved a similar process, but it occurred in the cloud. A phone would send a low-resolution version of an image to a web server, which would then send back a 'transform recipe' that could be used to retouch the high-resolution version of the image on the phone, reducing bandwidth consumption. 'Google heard about the work I'd done on the transform recipe,' says Michaël Gharbi, an MIT graduate student in electrical engineering and computer science and first author on both the original and new papers.


Custom camouflage

AITopics Original Links

If a bulky electrical box has to be placed at the edge of a public park, what's the best way to conceal it so that it won't detract from its surroundings? At the conference on Computer Vision and Pattern Recognition in June, researchers from MIT and several other institutions take a first stab at answering these types of questions, with a new algorithm that can analyze photos of a scene, taken from multiple perspectives, and produce a camouflage covering for an object placed within it. The researchers developed a range of candidate algorithms and tested them using Amazon's Mechanical Turk crowdsourcing application, scoring them according to the amount of time volunteers took to locate camouflaged objects in synthetic images. Objects hidden by their best-performing algorithm took, on average, more than three seconds to find -- significantly longer than the casual glance the camouflage is intended to thwart. According to Andrew Owens, an MIT graduate student in electrical engineering and computer science and lead author on the new paper, the problem of disguising objects in a scene is, to some degree, the inverse of the problem of object detection, a major area of research in computer vision.